ABSTRACT
In this study time series analysis is applied to the problem of
forecasting state income tax receipts. An objective criterion
developed by Hannan and Quinn (1979) is applied to identify the model and
a Box-Cox (1964) transformation is used to select between the log and
linear versions of the model. Out-of-sample forecasts from the model
are compared to forecasts obtained from an econometric model. The
time series model consistently outperforms the econometric model in
forecasting state tax receipts according to the percentage root mean
square error test. The study establishes time series analysis as a
viable technique for forecasting state tax receipts.


I.   Introduction 

The growing importance of state income tax revenues makes their
accurate forecasting important. The techniques currently used include
both single equation econometric models, such as those of Singer (1968)
and Greytak and Thursby (1980), and simultaneous equation models, such
as the one developed by Auten and Robb (1976). The data may be either
quarterly or annual, and explanatory variables often include income,
population, income per capita, and the tax rate. The purpose of this
paper is to demonstrate the application of another forecasting
technique, time series analysis, to the problem of forecasting state tax
revenues.

Time series analysis offers some advantages over econometric
forecasting. It does not require a large amount of data
and it can handle seasonal fluctuations better than the regression
technique. The disadvantage is that its application requires an
experienced researcher to identify the model. Forecasting with a time
series model is often more like an art than a science. We propose to
reduce the subjectivity associated with time series analysis by
employing an objective criterion for identifying the model. We also
apply the Box-Cox (1964) transformation to discriminate between the
linear and the log form of the model.

The data used in this study are for the state of Illinois but the
techniques are readily applicable to other states. The data
are quarterly Illinois income tax receipts for the period 1970 I
through 1980 IV. The model is used to forecast over the period 1981 I
through 1983 II. The forecasts are evaluated using the percentage
root mean square error for the out-of-sample data and are compared to
forecasts obtained from an econometric model relating income tax
receipts to personal income. Our results show that a properly
specified time series model outperforms a single equation econometric
model according to the percentage root mean square error loss function.

In  Section  II,  the  evaluation  criterion  used  in  the  study  is 
explained  and  justified.   In  Section  III,  the  results  of  employing  the 
evaluation  criterion  to  select  the  "best"  model  are  presented  and  in 
Section  IV,  the  time  series  forecasts  are  compared  with  those  obtained 
from  an  econometric  forecasting  model.   In  Section  V,  the  results  are 
evaluated. 

II.   Time  Series  Methodology 

As  the  first  step  in  our  investigation,  we  have  to  find  the  "true" 
order  of  the  underlying  linear  and  log  linear  ARMA  (p,q)  process: 

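$$X_t = \beta_0 + \beta_1 X_{t-1} + \cdots + \beta_p X_{t-p} + \epsilon_t + \gamma_1 \epsilon_{t-1} + \cdots + \gamma_q \epsilon_{t-q}$$

$$\log X_t = \beta_0 + \beta_1 \log X_{t-1} + \cdots + \beta_p \log X_{t-p} + \epsilon_t + \gamma_1 \epsilon_{t-1} + \cdots + \gamma_q \epsilon_{t-q}$$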

Following Box and Jenkins (1970), the conventional approach to the
problem of model selection in time series suggests that the sample
autocorrelation and partial autocorrelation functions should be
employed for determining the order of the generating process. Of
course, this procedure assumes that these sample statistics
closely resemble the autocorrelation and partial autocorrelation
functions of the unknown process. Furthermore, this procedure requires a
skillful researcher who, equipped with vast experience, can determine
the order of the process by visual inspection. Obviously such a
requirement introduces undesirable and, to some extent, unnecessary
subjectivity into the inference.

Due to the above difficulties and some other problems associated
with the identification stage of the Box-Jenkins prescription for model
selection (see Newbold, 1983), time series analysts have recently
considered the Akaike Information Criterion (AIC), the Bayesian
Information Criterion (BIC) and the Hannan-Quinn criterion (HQ) as promising
tools for model identification.

Akaike (1973) used the Kullback-Leibler information criterion to
derive his celebrated criterion:

$$\mathrm{AIC} = -\log f(X \mid \hat{\theta}) + K \qquad (1)$$

where $\hat{\theta}$ is the maximum likelihood estimate of the parameter
vector $\theta$ based on the realizations (observations) $X$, and K is the
number of parameters to be estimated. The criterion suggests that from a pool of
competing models, that model should be chosen that minimizes the value
of AIC. As can readily be seen from equation (1), AIC consists of two
terms: the first term is a measure of the goodness of fit of the model
and the second term is a measure of the price that should be paid for
increasing the number of parameters. By showing the existence of a
trade-off between the fit of the model and the number of parameters to
be estimated, Akaike's criterion explicitly formulates the principle
of parsimony, which advocates the use of the smallest possible number
of parameters in the model.

While understanding the simplified explanation of AIC given
above is crucial to understanding the remainder of the paper, we will
not employ AIC for identifying our model. This exclusion is due to
AIC's inconsistency in estimating the order of the process, which has
been pointed out by Shibata (1976) for the AR and Hannan (1980) for
the ARMA processes.

The inconsistency of the estimates obtained from the AIC led to the
development of a new criterion introduced independently by Akaike
(1977), Rissanen (1978) and Schwarz (1978), which is commonly known as
BIC. BIC, which is strongly consistent, is given by:

$$\mathrm{BIC} = -\log f(X \mid \hat{\theta}) + K \log N \qquad (2)$$

where N is the number of independently repeated realizations.
Unfortunately, the theoretical framework on which BIC is constructed
requires N to grow to infinity. This requirement reduces the
attractiveness of BIC for choosing the "best" model when only a small number
of observations is available.

Hannan and Quinn (1979) suggested that the expression $\log N$ in
(2) should be replaced by $c \log\log N$ ($c > 1$). The rationale behind
this suggestion is that while such a change does not affect the
consistency of the estimates, it increases the rate of the decrease of
the second term as the sample size gets larger. Hence, similar to AIC
and BIC, the Hannan and Quinn criterion penalizes an increase in the
number of parameters. But as the sample size grows the penalty
assigned by the HQ criterion decreases faster than those assigned by
AIC or BIC. The HQ criterion is given by

$$\mathrm{HQC} = -\log f(X \mid \hat{\theta}) + cK \log\log N \qquad (3)$$

In equations (1) through (3) K is the number of parameters to be
estimated and, hence, for the ARMA models, it is simply p + q. It can be
shown (see Hopwood, McKeown and Newbold, 1984) that the maximum
likelihood function can be written as:

$$-\log f(X \mid \hat{\theta}) = -\frac{N}{2} \log \hat{\sigma}^2 + (\lambda - 1) \sum_{t=1}^{N} \log X_t$$

where $\lambda$ is the transformation parameter such that $\lambda = 1$ for a linear
model and $\lambda = 0$ for a logarithmic model (see below). Hence equation (3)
may be rewritten as:

$$\mathrm{HQC} = -\frac{N}{2} \log \hat{\sigma}^2 + (\lambda - 1) \sum_{t=1}^{N} \log X_t + c(p+q) \log\log N \qquad (4)$$

The Hannan-Quinn criterion, as specified by equation (4) with c = 2, is
used below for estimating the dimension of our models.
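
The three criteria differ only in their penalty term, which a short Python sketch can make concrete. The function below evaluates equations (1) through (3) exactly as they are written in this section, taking the first term as a number already computed elsewhere. The function name, its argument names, and the numerical inputs are illustrative assumptions and are not taken from the study.

```python
import math


def information_criteria(first_term, k, n, c=2.0):
    """Evaluate AIC, BIC and HQC as written in equations (1)-(3).

    first_term : value of -log f(X | theta_hat), computed elsewhere
    k          : number of estimated parameters (p + q for an ARMA model)
    n          : number of observations N
    c          : Hannan-Quinn constant (the paper uses c = 2)
    """
    if n <= 1:
        raise ValueError("log log N is only defined for N > 1")
    return {
        "AIC": first_term + k,
        "BIC": first_term + k * math.log(n),
        "HQC": first_term + c * k * math.log(math.log(n)),
    }


# Hypothetical inputs: first term -120.0, K = 2 parameters, N = 44 quarters.
crit = information_criteria(-120.0, k=2, n=44)
for name, value in crit.items():
    print(f"{name}: {value:.3f}")
```

With these made-up inputs and c = 2, the HQ penalty lies between the AIC penalty and the BIC penalty, which is the ordering one would expect for a sample of a few dozen quarterly observations.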

III.   Model  Identification  Results 

Our  data  exhibited  both  a  trend  and  a  seasonal  component.   We 
transformed  the  data  to  a  stationary  series  by  first-order  differencing 
and  then  removed  the  seasonal  component  by  fourth-order  differencing. 
Hence,  our  differenced  data  point  will  be  of  the  form: 

$$Y_t = (1 - B)(1 - B^4) X_t$$

where  B  is  the  backshift  operator. 
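
The combined differencing can be sketched in a few lines of Python. Expanding the operator gives Y_t = X_t - X_{t-1} - X_{t-4} + X_{t-5}, so five observations are lost at the start of the sample. The helper names and the toy series are illustrative assumptions; the actual Illinois receipts are not reproduced here.

```python
def difference(series, lag=1):
    """Apply (1 - B^lag): subtract the value observed `lag` periods earlier."""
    return [series[t] - series[t - lag] for t in range(lag, len(series))]


def regular_and_seasonal_difference(x, season=4):
    """Apply (1 - B)(1 - B^season) to a series; the two operators commute."""
    return difference(difference(x, lag=season), lag=1)


# Made-up quadratic series x_t = t^2, used only to check the arithmetic.
x = [float(t * t) for t in range(10)]
y = regular_and_seasonal_difference(x)
print(y)
```

For a quadratic trend the doubly differenced series is constant, which makes this toy input a convenient check that the operator has been applied correctly.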

In our search for the "best" linear and log linear models we used
the direct derivation of the likelihood function as provided by
Hillmer and Tiao (1979) and employed a program developed at the
University of Wisconsin by Tiao et al. (1980) in order to find the
maximum value of the likelihood functions. Then we applied the HQ
criterion for obtaining the "best" linear and the "best" log linear
models. As can be seen from the results summarized in Table 1, in
both cases an ARMA(1,0) model was chosen as the most appropriate
specification:

$$Y_t = -1.08644 - .487563\, Y_{t-1} + \epsilon_t \qquad (5a)$$

$$\log Y_t = -.005338 - .536118\, \log Y_{t-1} + \epsilon_t \qquad (5b)$$

Hence, by employing an information criterion we have obtained two
competing models from which one should be chosen as the better
specification.

At this point researchers usually compare the likelihoods of the
two models and choose the "better" model on the basis of this
comparison. This approach, however, is deficient in some respects. The
first, and probably the most significant, deficiency stems from the
fact that one model is always chosen even when neither of the models
may be significant in describing the phenomenon in question.
Furthermore, comparison of likelihoods makes sense only if we compare
parameters belonging to the same parameter space (i.e., nested models).
In many cases, such as ours, the two models have different parameters
in such a way that one model could not be obtained by imposing
restrictions on the other model and, hence, they must be regarded as
non-nested models. In such situations likelihood comparison should
not be regarded as a reliable means of model selection.

TABLE 1
Modelling the Linear and the Log Linear Forms of ARMA

Order      Linear Version HQC      Log Linear Version HQC

(0,1)          -261.463                   52.675
(0,2)          -256.263                   55.317
(0,3)          -251.135                   57.974
(0,4)          -240.875                   61.683
(0,5)          -226.748                   68.570
(1,0)          -275.934                   48.358
(1,1)          -256.844                   55.097
(1,2)          -247.181                   58.066
(1,3)          -234.736                   61.490
(1,4)          -234.372                   65.007
(2,0)          -264.098                   53.469
(2,1)          -252.458                   56.489
(2,2)          -242.374                   59.269
(2,3)          -238.089                   66.465
(3,0)          -258.760                   56.196
(3,1)          -250.033                   58.865
(3,2)          -236.632                   61.620
(4,0)          -253.274                   58.988
(4,1)          -231.619                   66.091
(5,0)          -246.123                   62.579
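
The selection step amounts to picking the order with the smallest criterion value in each column of Table 1. The Python sketch below carries this out on the tabulated values; the dictionary layout and variable names are illustrative choices rather than part of the original analysis.

```python
# HQC values from Table 1, keyed by ARMA order (p, q):
# each entry is (linear version, log linear version).
hqc = {
    (0, 1): (-261.463, 52.675), (0, 2): (-256.263, 55.317),
    (0, 3): (-251.135, 57.974), (0, 4): (-240.875, 61.683),
    (0, 5): (-226.748, 68.570), (1, 0): (-275.934, 48.358),
    (1, 1): (-256.844, 55.097), (1, 2): (-247.181, 58.066),
    (1, 3): (-234.736, 61.490), (1, 4): (-234.372, 65.007),
    (2, 0): (-264.098, 53.469), (2, 1): (-252.458, 56.489),
    (2, 2): (-242.374, 59.269), (2, 3): (-238.089, 66.465),
    (3, 0): (-258.760, 56.196), (3, 1): (-250.033, 58.865),
    (3, 2): (-236.632, 61.620), (4, 0): (-253.274, 58.988),
    (4, 1): (-231.619, 66.091), (5, 0): (-246.123, 62.579),
}

# The preferred order minimizes the criterion within each functional form.
best_linear = min(hqc, key=lambda order: hqc[order][0])
best_log = min(hqc, key=lambda order: hqc[order][1])

print("linear:", best_linear, hqc[best_linear][0])
print("log linear:", best_log, hqc[best_log][1])
```

Note that the minimum is taken separately within each column: the linear and log linear values are on different scales, so the two columns are not compared with each other at this stage.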

In recent years a number of tests for comparing non-nested models
have been developed (for an excellent survey see McAleer, 1982). These
tests include the J-test of Davidson and MacKinnon (1981), the JA test
of Fisher and McAleer (1981), the PE test of MacKinnon et al. (1983)
and a test developed by Bera and McAleer (1982). All of these
test statistics are distributed as N(0,1) under the null hypothesis in large
samples. In non-nested testing the null and the alternative
hypotheses are compared to each other. Four outcomes are possible: (1)
accept only the null, (2) accept only the alternative, (3) accept both
the null and the alternative, (4) reject both the null and the
alternative hypothesis.

While the above tests could help us in testing model (5a) against
model (5b), they will not be very useful in testing either model (5a)
or model (5b) by itself against the data. For example, if one of the
tests for non-nested models were performed and the result of the test
was "reject both models", such a result would only mean that neither
of the specifications is of significance relative to the other model.
It does not follow that neither of the models is of significance by
itself in explaining the "reality." In other words, the above tests
are appropriate for model testing and not for model discrimination.

Since here we are interested in the absolute ability of each model in
explaining the "reality" and not in its relative ability, and since we
are concerned with a single series, we propose using the Box-Cox
transformation instead of the available methods for non-nested
hypothesis testing. This transformation has been employed for the
demand for money function (Zarembka, 1968 and Spitzer, 1976, 1977),
the consumption function (Tsao, 1975), the liquidity trap (White, 1972)
and the production function (Berndt and Khaled, 1979). Models (5a) and
(5b) can be rewritten as:

$$Y_t = \theta_0 + \theta_1 Y_{t-1} + \epsilon_t \qquad (6a)$$

$$\log Y_t = \beta_0 + \beta_1 \log Y_{t-1} + \epsilon_t \qquad (6b)$$

Employing the Box-Cox transformation yields:

$$Y_t(\lambda) = \delta_0 + \delta_1 Y_{t-1}(\lambda) + \epsilon_t(\lambda) \qquad (7)$$

where $\hat{\delta}_i$ is a consistent estimator of $\delta_i$ and
$\delta_i = \theta_i$, $i = 0, 1$, when $\lambda = 1$. Notice that if
$\lambda = 1$, (6a) and (7) will be identical and if
$\lambda = 0$, (6b) will be identical with (7). Hence a test for the linearity
of the model is a test for $\lambda = 1$ and a test for the log form is a test
for $\lambda = 0$.
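
The transformation itself is not written out in the text. The Python sketch below assumes the standard Box and Cox (1964) power transform, (y^λ - 1)/λ with the logarithm as its limit at λ = 0, as the meaning of the λ-indexed variables. The function name and the test value are illustrative only.

```python
import math


def box_cox(y, lam):
    """Standard Box-Cox power transform of a positive value y.

    Assumed form (not spelled out in the paper):
        (y**lam - 1) / lam   for lam != 0
        log(y)               for lam == 0 (the limiting case)
    """
    if y <= 0:
        raise ValueError("the Box-Cox transform requires y > 0")
    if lam == 0:
        return math.log(y)
    return (y ** lam - 1.0) / lam


y = 5.0  # made-up positive observation
print(box_cox(y, 1.0))   # linear case: y shifted by a constant
print(box_cox(y, 0.0))   # log case
print(box_cox(y, 0.75))  # intermediate value of lambda
```

At λ = 1 the transform is the original variable shifted by a constant, so only the intercept of the regression changes, and at λ = 0 it is the logarithm. This is why testing λ = 1 against λ = 0 discriminates between the linear and log forms.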

We should also note that if the error terms are normally
distributed (as is assumed in this paper), it may be impossible to
have a log model since such a model could result in negative values of
$\log Y_t$. If, however, $\bar{Y} = \frac{1}{n} \sum_{i=1}^{n} Y_i$ is several
standard deviations greater than zero, then even extreme negative values
for the error terms will not lead to negative values for $\log Y_t$ and,
as a result, the assumption of Gaussian error terms is justifiable.

Estimation of $\lambda$ and its standard deviation can easily be done by
using the Box routine in SHAZAM. Our investigation yielded $\lambda = 0.75$
with a t-ratio of .5468. This result indicates that at the 1, 5, or 10
percent level of significance the hypothesis $\lambda = 1$ can be accepted while
$\lambda = 0$ should be rejected. Therefore the linear form of ARMA(1,0)
should be regarded as the "best" model.

IV.   Forecast  Evaluation 

We  used  the  following  linear  and  log  models  for  forecasting  the 
out  of  sample  data  of  1981  I  to  1983  II: 


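$$X_t = -1.08644 + X_{t-1} + X_{t-4} - X_{t-5} - .487563\,(X_{t-1} - X_{t-5} - X_{t-2} + X_{t-6}) + \epsilon_t$$

$$\log X_t = -.005338 + \log X_{t-1} + \log X_{t-4} - \log X_{t-5} - .536118\,(\log X_{t-1} - \log X_{t-5} - \log X_{t-2} + \log X_{t-6}) + \epsilon_t$$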
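
As an illustration of how the linear forecasting equation is used, the Python sketch below produces forecasts from the last six observations with the future error term set to zero. The coefficients are those reported above. The function names, the made-up history, and the choice to feed earlier forecasts back in for multi-step horizons are assumptions, since the text does not say whether the reported forecasts were generated one step at a time or dynamically.

```python
CONST = -1.08644   # intercept of the linear ARMA(1,0) model
PHI = -0.487563    # AR(1) coefficient on the differenced series


def forecast_next(x):
    """One-step-ahead forecast from the linear equation, error set to zero."""
    if len(x) < 6:
        raise ValueError("at least six past observations are required")
    x1, x2, x4, x5, x6 = x[-1], x[-2], x[-4], x[-5], x[-6]
    return CONST + x1 + x4 - x5 + PHI * (x1 - x5 - x2 + x6)


def forecast_path(history, steps):
    """Multi-step forecasts, feeding each forecast back in as data (assumed)."""
    path = list(history)
    for _ in range(steps):
        path.append(forecast_next(path))
    return path[len(history):]


# Made-up quarterly history, NOT the Illinois receipts series.
history = [100.0, 110.0, 120.0, 130.0, 104.0, 115.0, 126.0]
print(forecast_next(history))
print(forecast_path(history, 4))
```

The log version follows the same recursion applied to the logarithms of the series, with the forecast exponentiated at the end.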

We  also  used  the  linear  and  the  log  forms  of  a  simple  econometric 
model  which  uses  the  information  available  on  personal  income  at  time 
t  for  predicting  the  tax  revenues  of  the  same  time  period: 


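$$X_t = \underset{(-5.77)}{-116142.01} + \underset{(28.67)}{6.155}\, I_t + \underset{(3.83)}{53270.25}\, Q_1 + \underset{(12.17)}{168854.56}\, Q_2 + \underset{(.14)}{2019.22}\, Q_3$$

$$\bar{R}^2 = .958 \qquad \mathrm{D.W.} = 1.729$$

$$\log X_t = \underset{(-1.46)}{-.542} + \underset{(36.13)}{1.185}\, \log I_t + \underset{(4.23)}{.113}\, Q_1 + \underset{(13.86)}{.369}\, Q_2 - \underset{(-.02)}{.0005}\, Q_3$$

$$\bar{R}^2 = .972 \qquad \mathrm{D.W.} = 1.612$$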

where X is the income tax receipts, I is personal income and $Q_i$,
i = 1, 2, 3, represents seasonal dummies (see footnote 2). The percentage
root mean square error (RMSE(%)) is measured by:

$$\mathrm{RMSE}(\%) = \frac{1}{\bar{X}} \sqrt{\frac{1}{N} \sum_t \left(X_t^s - X_t^a\right)^2}$$

where $X_t^s$ is simulated tax receipts and $X_t^a$ is actual tax receipts in
period t, and $\bar{X}$ is the mean of the variable. The results are
summarized in Table 2. As can be seen, both the linear and log time
series models outperformed the econometric models according to the
percentage root mean square error criterion. The linear time series
model performed the best, followed by the log linear time series
model. Both had substantially smaller root mean square errors than
did the econometric models.
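
The evaluation measure can be checked with a short Python sketch applied to the actual, linear time series, and log time series columns of Table 2. The function name is illustrative, and the sketch assumes that the mean in the denominator is the mean of actual receipts over the forecast period, which the text does not state explicitly.

```python
import math


def rmse_pct(simulated, actual):
    """Root mean square forecast error divided by the mean of the actuals."""
    n = len(actual)
    mse = sum((s - a) ** 2 for s, a in zip(simulated, actual)) / n
    return math.sqrt(mse) / (sum(actual) / n)


# Table 2, 1981 I through 1983 II (millions of dollars).
actual = [652.378, 890.722, 593.101, 654.392, 724.400,
          894.554, 625.644, 580.554, 711.164, 881.000]
linear_ts = [659.185, 850.514, 631.295, 687.441, 684.115,
             959.562, 629.093, 671.849, 701.342, 802.534]
log_ts = [661.244, 851.615, 619.693, 599.927, 678.988,
          986.778, 625.273, 668.769, 701.390, 827.371]

print(f"linear time series: {rmse_pct(linear_ts, actual):.4f}")
print(f"log time series:    {rmse_pct(log_ts, actual):.4f}")
```

Under that assumption the two time series columns come out at roughly 0.069 and 0.072, close to the .068 and .071 reported in Table 2, and the linear model ranks ahead of the log model as in the table.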

V.   Conclusions 

In this paper we applied some "objective" standards to the
development of a time series model for forecasting state income tax receipts.
Using the Hannan-Quinn criterion, we first determined that the linear
and log linear versions of the ARMA(1,0) model are preferred over
other models. We then used a Box-Cox transformation to select the
linear version of the time series model.

When compared with the forecasts from an econometric model, the
forecasts obtained from the linear time series model were judged far
superior by the percentage root mean square error criterion. This is
a significant finding since the econometric model employed more
information than did the time series model. In particular, the econometric
model depended on personal income for both the in-sample and
out-of-sample periods while the time series model used only tax receipts
data. Hence, the data requirements for the time series model were
substantially less than those of the econometric model. Still, the
time series model performed better according to our test.

TABLE 2
Forecasts and the Percentage Root Mean Square Error
(RMSE(%)) for Forecasts over the Period 1981.1-1983.2
(millions $)

                        Linear          Log             Linear         Log
Period     Actual       Econometrics    Econometrics    Time Series    Time Series

81.1       652.378      728.960          739.907        659.185        661.244
81.2       890.722      865.386          986.048        850.514        851.615
81.3       593.101      718.961          701.531        631.295        619.693
81.4       654.392      727.018          711.955        687.441        599.927
82.1       724.400      774.816          790.958        684.115        678.988
82.2       894.554      905.339         1043.760        959.562        986.778
82.3       625.644      741.550          724.126        629.093        625.273
82.4       580.554      748.290          733.292        671.849        668.769
83.1       711.164      811.870          832.593        701.342        701.390
83.2       881.000      946.314         1103.497        802.534        827.371

RMSE(%)                   .181             .173           .068           .071
Ranking                   4                3              1              2

Forecasting tax receipts with a time series model is not a
commonly followed procedure among state tax forecasters. One reason may
be the novelty of the approach and the difficulty of identifying an
appropriate time series model. The main contribution of this paper is
to show that relatively easy to apply techniques are available for
time series model selection and that the application of these
techniques can lead to a time series model which outperforms a standard
econometric forecasting model.

FOOTNOTES

1. The Illinois income tax, introduced in 1970, is a flat-rate tax
on individual and corporate income applied at rates of 2.5 percent and
4.0 percent respectively. The state of Illinois recently enacted a 20
percent temporary increase in the individual and corporate income
taxes to be retroactive to the period 1983 I through 1984 II. Since
the temporary tax increase did not impact state tax receipts until
1983 III, the period immediately following our sample data, the
temporary tax increase was disregarded in this study.

2. The numbers in parentheses are t-ratios, $\bar{R}^2$ is the adjusted $R^2$,
and D.W. is the Durbin-Watson statistic.